feat: migrate Qwen3 reasoning parser to token-level OutputRouter (closes #64) by Ai-chan-0411 · Pull Request #97 · raullenchai/Rapid-MLX

Ai-chan-0411 · 2026-04-11T15:34:58Z

Closes #64

Summary

Migrate Qwen3's <think>/</think> reasoning parser from regex-based BaseThinkingReasoningParser to the token-level OutputRouter state machine.

Changes

vllm_mlx/output_router.py

- Add <think>/</think> token handling in feed(): think_start enters the THINKING state, think_end switches to CONTENT
- Add Qwen3/DeepSeek auto-detection in from_tokenizer() via <think> + </think> vocabulary entries
- Token map already had think_start/think_end fields (added for future migration); these are now wired up
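The routing described above can be sketched roughly as follows. This is a minimal illustration, not the code in vllm_mlx/output_router.py: `SketchRouter`, `TokenMap`, `State`, `Channel`, and the token IDs are hypothetical stand-ins. Only the names feed(), think_start, think_end, THINKING, CONTENT, and REASONING come from this PR.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class State(Enum):
    CONTENT = "content"
    THINKING = "thinking"


class Channel(Enum):
    CONTENT = "content"
    REASONING = "reasoning"


@dataclass(frozen=True)
class TokenMap:
    # Hypothetical container; the real token map has more fields.
    think_start: int
    think_end: int


class SketchRouter:
    """Token-level router: decisions depend on token IDs only, never on text."""

    def __init__(self, token_map: TokenMap) -> None:
        self.token_map = token_map
        self.state = State.CONTENT

    def feed(self, token_id: int) -> Optional[Channel]:
        """Return the channel for a token, or None for a suppressed tag token."""
        if token_id == self.token_map.think_start:
            self.state = State.THINKING
            return None
        if token_id == self.token_map.think_end:
            self.state = State.CONTENT
            return None
        if self.state is State.THINKING:
            return Channel.REASONING
        return Channel.CONTENT

    def reset(self) -> None:
        """Clear thinking state between requests."""
        self.state = State.CONTENT
```

Because each tag is matched as a whole token ID, the router never has to buffer partial text to decide which state it is in.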

tests/test_output_router.py

- Add Qwen3 vocabulary fixtures and qwen3_router fixture
- Add TestQwen3ThinkRouting class with 9 tests:
  - <think> enters THINKING state
  - </think> switches to CONTENT state
  - Tokens between tags routed to REASONING channel
  - Content tokens after </think> routed to CONTENT channel
  - Full <think>reasoning</think>content sequence via feed_sequence()
  - Implicit thinking (no <think>, only </think>) handled
  - No-tag output passes through as pure content
  - Control tokens (bos/eos) suppressed
  - Reset clears thinking state
- Add test_qwen3_detected in TestFromTokenizer
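A few of the scenarios listed above, written as plain assertions against a hypothetical stand-in. The `feed_sequence` function below is not the method from this PR, the token IDs are made up, and the handling of the implicit case (treating everything before a bare </think> as reasoning) is an assumption about the intended behavior rather than something the PR text spells out.

```python
from typing import List, Tuple

# Hypothetical token IDs for illustration only.
BOS, EOS, THINK_START, THINK_END = 1, 2, 100, 101


def feed_sequence(token_ids: List[int]) -> Tuple[List[int], List[int]]:
    """Split token IDs into (reasoning, content) lists."""
    # Assumption: a </think> with no <think> anywhere means the model
    # started out in thinking mode, so we begin in the thinking state.
    thinking = THINK_END in token_ids and THINK_START not in token_ids
    reasoning: List[int] = []
    content: List[int] = []
    for tok in token_ids:
        if tok in (BOS, EOS):
            continue  # control tokens are suppressed
        if tok == THINK_START:
            thinking = True
        elif tok == THINK_END:
            thinking = False
        elif thinking:
            reasoning.append(tok)
        else:
            content.append(tok)
    return reasoning, content


# Explicit: <think>reasoning</think>content
assert feed_sequence([100, 10, 11, 101, 20]) == ([10, 11], [20])
# Implicit: no <think>, only </think>
assert feed_sequence([10, 11, 101, 20]) == ([10, 11], [20])
# No tags: everything is content
assert feed_sequence([20, 21]) == ([], [20, 21])
# Control tokens (bos/eos) are dropped
assert feed_sequence([1, 20, 2]) == ([], [20])
```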

31 tests total, all passing.

Design

Follows the same pattern as the existing Gemma 4 implementation — token-level state machine with no text-level regex matching. This eliminates partial-token split issues that affect the current regex-based parser.
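To illustrate the kind of split issue meant here, the sketch below contrasts a deliberately naive per-chunk text matcher with an ID comparison. Both helpers, the chunk boundaries, and the token ID are hypothetical; the existing regex-based parser may buffer text differently than this simplified matcher does.

```python
import re
from typing import List

THINK_OPEN = re.compile(r"<think>")

# Hypothetical token ID for <think>; real IDs depend on the tokenizer.
THINK_START_ID = 100


def regex_hits(text_chunks: List[str]) -> List[bool]:
    """Naive per-chunk text matching: True where a chunk contains <think>."""
    return [THINK_OPEN.search(chunk) is not None for chunk in text_chunks]


def token_hits(token_ids: List[int]) -> List[bool]:
    """Token-level matching: True where the ID equals the <think> token."""
    return [tok == THINK_START_ID for tok in token_ids]


# The tag text arrives split across two streamed chunks, so no single
# chunk matches the pattern.
print(regex_hits(["<th", "ink>", "step 1"]))  # [False, False, False]

# The same tag as a single vocabulary token is matched by ID.
print(token_hits([100, 7, 8]))  # [True, False, False]
```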

The detection uses <think> + </think> in the tokenizer vocabulary, which covers both Qwen3 and DeepSeek R1 model families.
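A vocabulary check of this kind might look like the sketch below. `detect_think_tokens` is a hypothetical helper, not the body of from_tokenizer(), and it takes a plain token-to-ID mapping (for example, the kind returned by a Hugging Face tokenizer's `get_vocab()`); how the real method reads the vocabulary is not shown in this PR description.

```python
from typing import Dict, Optional, Tuple


def detect_think_tokens(vocab: Dict[str, int]) -> Optional[Tuple[int, int]]:
    """Return (think_start_id, think_end_id) if both tags are in the vocab."""
    start = vocab.get("<think>")
    end = vocab.get("</think>")
    # Require both tags; a vocabulary with only one of them is not treated
    # as a thinking-capable tokenizer in this sketch.
    if start is None or end is None:
        return None
    return start, end


# Toy vocabularies with made-up IDs.
thinking_vocab = {"hello": 5, "<think>": 100, "</think>": 101}
plain_vocab = {"hello": 5, "world": 6}

assert detect_think_tokens(thinking_vocab) == (100, 101)
assert detect_think_tokens(plain_vocab) is None
```

Keying on the two tag strings rather than on a model name is what lets one check cover more than one model family.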

raullenchai#64)

Add <think>/</think> token detection and routing to OutputRouter, replacing fragile regex-based text matching with token-level state machine transitions. Supports explicit, implicit, and no-tag scenarios.

- Detect Qwen3/DeepSeek tokenizers via <think> + </think> vocab entries
- Route tokens between <think>/</think> to REASONING channel
- 9 new tests covering all Qwen3 scenarios (31 total, all passing)

Ai-chan-0411 · 2026-04-11T21:23:23Z

Closing as maintainer has not reviewed after 168h. Thank you for the opportunity!

Ai-chan-0411 closed this Apr 11, 2026

Ai-chan-0411 wants to merge 1 commit into raullenchai:main from Ai-chan-0411:feat/qwen3-output-router
